A method for identifying user interface (UI) objects in a markup-language stream, the 
method comprising the steps of: 

A) scanning any of (i) the markup-language stream and (ii) a corresponding document 
object model (DOM) to generate tokens, 

B) parsing the tokens based on a grammar to identify one or more UI objects. 
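
By way of illustration only, steps A and B might be sketched as follows. This is a minimal sketch, not the claimed implementation: the token kinds (`LABEL_OPEN`, `INPUT`, `IMAGE`, `TEXT`), the three-rule toy grammar, and the names `Scanner`, `parse`, and `identify_ui_objects` are all hypothetical and chosen only for the example. The scanner uses Python's standard `html.parser` module to read the stream.

```python
from html.parser import HTMLParser


class Scanner(HTMLParser):
    """Step A: turn a markup-language stream into a flat list of tokens."""

    def __init__(self):
        super().__init__()
        self.tokens = []  # (kind, value) pairs, in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            self.tokens.append(("LABEL_OPEN", None))
        elif tag == "input":
            self.tokens.append(("INPUT", attrs.get("name", "")))
        elif tag == "img":
            self.tokens.append(("IMAGE", attrs.get("src", "")))

    def handle_endtag(self, tag):
        if tag == "label":
            self.tokens.append(("LABEL_CLOSE", None))

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.tokens.append(("TEXT", text))


def parse(tokens):
    """Step B: apply a toy grammar to the tokens.

    Hypothetical grammar for this sketch:
        object        := labeled_field | image | text
        labeled_field := LABEL_OPEN TEXT LABEL_CLOSE INPUT
        image         := IMAGE
        text          := TEXT
    """
    objects = []
    i = 0
    while i < len(tokens):
        kinds = [kind for kind, _ in tokens[i:i + 4]]
        if kinds == ["LABEL_OPEN", "TEXT", "LABEL_CLOSE", "INPUT"]:
            objects.append({"type": "input_field",
                            "label": tokens[i + 1][1],
                            "name": tokens[i + 3][1]})
            i += 4
        elif tokens[i][0] == "IMAGE":
            objects.append({"type": "image", "src": tokens[i][1]})
            i += 1
        elif tokens[i][0] == "TEXT":
            objects.append({"type": "text", "content": tokens[i][1]})
            i += 1
        else:
            i += 1  # token not covered by the toy grammar; skip it
    return objects


def identify_ui_objects(markup):
    scanner = Scanner()
    scanner.feed(markup)
    scanner.close()
    return parse(scanner.tokens)
```

In this sketch the grammar is what distinguishes a labeled input field from free-standing text: the same `TEXT` token yields a different UI object depending on the tokens around it.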

The method of claim 1, wherein said markup-language stream drives a
markup-language-based browser application, and wherein the scanning step includes
scanning the DOM generated by a browser that displays that application.

The method of claim 1, wherein the scanning step includes identifying elements of the 
DOM by traversal thereof. 

The method of claim 3, wherein the grammar is application-specific. 

The method of claim 3, wherein the scanning step includes generating one or more tokens 
for each parsed DOM element. 

The method of claim 3, wherein the scanning step includes mapping DOM elements to
tokens.
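
As a hedged illustration of scanning by DOM traversal, the sketch below walks a DOM depth-first and maps each element to a pair of tokens (one on entry, one on exit). It assumes a well-formed document that Python's standard `xml.dom.minidom` can load; the `TOKEN_KINDS` table and the `scan_dom` function are hypothetical names introduced only for this example.

```python
from xml.dom import minidom

# Hypothetical mapping from element names to token kinds; any element
# not listed here falls back to the generic kind "ELEMENT".
TOKEN_KINDS = {"input": "INPUT", "img": "IMAGE", "a": "LINK"}


def scan_dom(node, tokens=None):
    """Depth-first traversal that emits tokens in document order."""
    if tokens is None:
        tokens = []
    is_element = node.nodeType == node.ELEMENT_NODE
    if is_element:
        kind = TOKEN_KINDS.get(node.tagName.lower(), "ELEMENT")
        attrs = dict(node.attributes.items())
        tokens.append((kind + "_OPEN", node.tagName, attrs))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        tokens.append(("TEXT", node.data.strip(), {}))
    for child in node.childNodes:
        scan_dom(child, tokens)
    if is_element:
        tokens.append((kind + "_CLOSE", node.tagName, {}))
    return tokens
```

Emitting both an open and a close token per element is one design choice among several; it preserves nesting so that a downstream grammar can recognize containers such as forms or tables.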

The method of claim 1, wherein the parsing step includes parsing the tokens according to
the grammar to identify and distinguish among UI objects in the markup-language
stream.

The method of claim 7, wherein said UI objects comprise user input fields, text fields, 
metatags, unprintable markup-language, and in-line images. 

The method of claim 1, wherein the scanning and parsing steps are adapted to identify UI
objects that correspond to elements displayed in the markup-language application.

The method of claim 9, wherein said parsing groups the tokens into syntactic structures
that identify items displayed by the markup-language application.

The method of claim 9, wherein said step of scanning can include identifying similarly
formatted markup-language elements based on their markup-language attributes such as
classname, font size, style, tag color, and size.
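
One way to picture the identification of similarly formatted elements is to group elements by a signature built from their formatting attributes. The sketch below is illustrative only: the attribute names in `FORMAT_ATTRS` are assumed stand-ins for the attributes recited above, and `FormatGrouper` and `group_by_format` are hypothetical names. Elements are recorded by their `id` attribute purely to make the groups easy to inspect.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Assumed formatting attributes; a real scanner might inspect others.
FORMAT_ATTRS = ("class", "style", "size", "color")


class FormatGrouper(HTMLParser):
    """Groups elements that share a tag name and formatting attributes."""

    def __init__(self):
        super().__init__()
        self.groups = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Signature: tag name plus the value (or None) of each attribute.
        signature = (tag,) + tuple(attrs.get(name) for name in FORMAT_ATTRS)
        self.groups[signature].append(attrs.get("id"))


def group_by_format(markup):
    grouper = FormatGrouper()
    grouper.feed(markup)
    grouper.close()
    return dict(grouper.groups)
```

Elements that land in the same group could then be mapped to the same token kind, so that, for instance, every cell of a repeated table row reaches the parser as one kind of token.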

The method of claim 9, wherein said objects comprise name, content, shape, location, and 
properties. 

The method of claim 1, wherein said parser is built using automated tools such as YACC 
(yet another compiler-compiler). 

The method of claim 13, wherein said parser is built by an automated parser generator 
tool that accepts a source input file containing a predefined grammar. 

The method of claim 13, wherein said parser is built manually by hand-programming. 

The method of claim 14, wherein the parsing is conducted by an LALR(1) parser.

The method of claim 14, wherein the parsing is conducted by an LR(1) parser.

The method of any of claims 1-17, wherein the markup language is any of HTML, 
XHTML and XUL. 

A system for identifying user interface (UI) objects in markup-language-based 
applications comprising: 

a scanner receiving markup-language DOM and generating one or more tokens for each
DOM element, and

a parser coupled to the scanner receiving said tokens, and parsing said tokens based on a 
grammar, and generating a list of UI objects. 

The system of claim 19, wherein the list of UI objects corresponds to elements displayed
by the markup-language DOM.

The system of claim 20, wherein said UI objects comprise name, content, shape, location, 
and properties. 

The system of claim 19, wherein the grammar is application-specific. 

The system of claim 19, wherein said tokens are interpreted according to the grammar to
identify and distinguish among UI objects of a markup-language application's display.

The system of claim 23, wherein the UI objects comprise user input fields, text fields, 
metatags, unprintable markup-language, and in-line images. 

The system of any of claims 19-24, wherein the markup language is any of HTML,
XHTML and XUL.